AI: streaming-safe middleware, agent-driven Runner, ModelBuilder tool binding by bedus-creation · Pull Request #139 · fastapi-startkit/fastapi-startkit-framework

bedus-creation · 2026-06-23T06:44:59Z

Summary

Hardens the AI agent harness: middleware no longer breaks streaming, tool binding/execution have one clear owner each, and the Runner is driven by the agent. Tests are realigned to the current API.

Streaming through middleware (the main fix)

With middleware attached, .stream() used to buffer the whole response. A middleware written the natural way — final = await handler(model) — drains the entire Response stream before returning, so:

the first token only appeared after the full generation (seconds of latency instead of sub-second first-token), and
after-hooks (e.g. response logging) fired only at the very end.

Fix: build_pipeline now hands every middleware a Response-returning handler, so a layer can attach .then(callback) and return it without awaiting. Awaiting still works for prompt() (buffered), but streaming stays token-by-token and the after-hook fires exactly once on completion (post-stream for stream(), on the final message for prompt()).

Middleware may be sync or async — both supported. The canonical form is now sync:

def handle(self, model, handler):
    ...  # before
    return handler(model).then(after_fn)   # don't await — streaming-safe

build_pipeline checks isinstance(_, Response) before the awaitable check, since Response is itself awaitable (otherwise a sync handle returning a Response would be re-buffered).
Example AgentLogger converted to sync + .then().

Tool binding / execution ownership

ModelBuilder binds agent.tools() onto the chat model (what the LLM needs to emit tool calls).
Runner no longer binds — it keeps the name → BaseTool map purely to execute returned tool calls. bind_tools stores serialized schemas, not callables, so execution still needs the real tools.

Runner takes the agent

Runner(agent, model) / StreamRunner(agent, model) instead of threading tools/max_steps through every call site. model stays a separate arg because the middleware pipeline can transform it.

Typing & API cleanup

Agent.tools() typed as list[BaseTool] (lazy TYPE_CHECKING import) — fixes the tool.name/dict[str, BaseTool] type errors.
@model decorator now sets model (it previously set an unread _model, making the decorator a silent no-op).

Tests

Realigned test_agent.py / test_agent_decorators.py to the current API: instructions() method, provider attribute, ModelBuilder._resolve_model; removed the deleted memory decorator.
Added regression tests: streaming-through-middleware streams token-by-token with a single after-hook, and the prompt after-hook fires.

Verification

pytest tests/ai/ → 191 passed.
Full suite (excluding live-Postgres tests) → 1541 passed, 7 skipped.

🤖 Generated with Claude Code

…I intact Swap the per-provider SDK internals of ai/agent.py (_run/_stream over the anthropic/openai/google SDKs) for a single LangChain/LangGraph backend: init_chat_model builds the chat model and create_agent drives the tool loop, with the final AIMessage mapped back to AgentResponse. The user-facing surface is unchanged — prompt/stream/fake/assert_prompted/assert_not_prompted/reset, the lifecycle hooks, and the decorators keep identical signatures. - _build_model() is the seam tests patch to inject a fake chat model. - _build_messages() now renders attachments via Document.to_langchain_block(). - Add Document.to_langchain_block(): inline text, base64 image/file blocks. - Add ai/fakes.py fake_chat_model(): replays scripted AIMessage turns through a GenericFakeChatModel (bind_tools no-op) so the real create_agent loop runs offline; exported from the ai package root. - New optional [langgraph] extra (langchain + langchain-core + langgraph). The 23 tests in tests/ai/test_agent_fake.py stay green and unmodified (fake() short-circuits before the backend). Adds tests/ai/test_agent_langgraph_backend.py exercising the real loop offline: simple reply, full tool-calling loop, usage mapping, attachment blocks, provider mapping, and streaming.

…ent through it Turn ai/config.py into a config package and resolve models/providers through a new Lab helper instead of hardcoded dicts on the Agent. - ai/config/: split provider dataclasses (config.py) from the top-level AIConfig (ai.py), add config/__init__.py re-exporting them, and give each provider a models map keyed by modality (default / default_image / default_audio / default_transcribe). Fix the draft's circular import (AIConfig imported the provider configs from the package root mid-init) and the placeholder model values (google text default, elevenlabs models). - AIConfig selects the default provider per modality: default (text), default_image, default_audio, default_transcribe. image.py/audio.py now read default_image/default_audio (was image_provider/audio_provider). - ai/lab.py: Lab(StrEnum) + ModelType resolve the provider, default model, and the "<langchain-provider>:<model>" URL from Config (google → google_genai). - Agent: _resolve_model() and _build_model() now go through Lab; removed the stale _DEFAULT_MODELS/_LANGCHAIN_PROVIDERS references and the dead _execute_tool() (create_agent runs tools itself). - Tests: drop the _build_model monkeypatch helper; the backend tests now patch the real langchain.chat_models.init_chat_model seam via pytest monkeypatch. Add test_lab.py; update image/audio provider-selection mocks. The 23 tests in tests/ai/test_agent_fake.py stay green and unmodified.

…l loop Replace the LangGraph create_agent backend with a plain init_chat_model call driven by a Runner that resolves and executes tool calls itself. - runner.py: Runner(model, tools, max_steps) binds tools, invokes the model, executes requested tool calls, feeds results back, loops to a final answer; StreamRunner yields content tokens through the same loop. Fully typed. - Agent._run/_stream delegate to Runner/StreamRunner (threading _max_steps); no create_agent. - System message is declarative via instructions()/_instructions — removed the per-call system= and messages= arguments from prompt()/stream(). - Resolve provider/model through Lab directly (dropped the _lab() helper) and import Lab at module top. - Config split into ai/config/{ai,config}.py. KNOWN RED: ai/config/__init__.py is absent, so 'from fastapi_startkit.ai.config import AIConfig' fails — the AI test suite does not collect and AIProvider import breaks. Backend tests also still assume the old create_agent result shape and the tuple return of _build_messages. Follow-ups.

Add a class-level testing harness so an agent can be faked or recorded without a real model provider: - Agent.fake({...}) and Agent.record(path) return an AgentBinding usable as a context manager or test decorator. The binding swaps a stand-in into the service container under the agent's class name and auto-resets on exit, so even a controller's own ChatAgent().prompt(...) is covered. - FakeAgent answers from glob patterns; RecordingAgent records the real reply to JSON once, then replays it. Both expose assert_prompted(). - Agent.make()/faked() resolve the bound stand-in for assertions. - prompt()/stream() delegate to an active binding; the in-process agent.fake({...}) instance API is preserved via a dual-purpose accessor. - Lab.ModelType carries the models-map key as a static mapping. - Import AIConfig from the fastapi_startkit.ai namespace in tests and the AI facade stub, matching the provider registration. Tests: full suite 1541 passed, 7 skipped; ruff clean.

Convert bare assert statements to self.assertEqual in the example/agents feature tests, matching the unittest.IsolatedAsyncioTestCase base. Keep the async test methods and the @ChatAgent.fake / @ChatAgent.record decorators intact.

…sertions test(agents): use unittest assertions in example feature tests

…Lab default fallback - Replace the FakeAccessor descriptor with a single Agent.fake() classmethod; drop the unused faked()/bound aliases and rename the internal stand-in resolver to _faked(), sharing container lookup via _binding(). - Runner.run() now returns the tool result directly instead of looping the output back to the model (custom single-shot tool semantics). - Resolve provider via Lab.get_provider(self._provider) so a None provider falls back to the configured default instead of raising. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Convert the AI tests from pytest function style to unittest.TestCase classes (fake/record, decorators, lab, config, provider, document, response, agent). Rename test_agent_langgraph_backend.py to test_agent.py, add a TestAgentRecord class for record-and-replay, and align expectations with the actual config default provider (google). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ckend AIProvider.register() now resolves AIConfig and merges it into the config store (merge_config_from) instead of binding into the container and setting it in boot(); boot() is a no-op. Drop the unused _memory_backend class attribute on Agent. Update provider tests to the new behaviour. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Agent.prompt() and stream() are now coroutines/async generators, matching the framework's async-first design. The Runner uses ainvoke/astream/ainvoke for tools, the fake/record stand-ins and AgentSnapshot.resolve are async, and the fake/record/agent tests run under IsolatedAsyncioTestCase. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Drop the bundled provider SDKs (anthropic, openai, google-generativeai) and the unused langgraph from the [ai] extra and dev group — providers are pulled lazily by init_chat_model and are now opt-in. Fix stale langgraph references in the fake-model helper to point at the [ai] extra. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@model

…n ModelBuilder Streaming previously buffered the entire response when an agent had middleware: `final = await handler(model)` drained the Response stream before returning, so the first token only appeared after the full generation and after-hooks fired late. build_pipeline now hands each middleware a Response-returning handler so layers can attach `.then(callback)` and return without awaiting — streaming-safe, and the after-hook fires once on completion (buffered for prompt, post-stream for stream). Middleware may be sync or async; the example AgentLogger is now sync + `.then()`. Other AI changes: - ModelBuilder binds tools (agent.tools()) onto the chat model; Runner no longer binds, only keeps the tool map for execution. - Runner takes the agent (Runner(agent, model)) instead of threading tools/max_steps separately. - Agent.tools() typed as list[BaseTool]; @model decorator sets `model` (was the unread `_model`). - Tests updated to the current API (instructions() method, provider attr, ModelBuilder._resolve_model) and drop the removed `memory` decorator; add regression tests for streaming-through-middleware and the prompt after-hook. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add direct tests for ai/pipeline.py — the Response deferred-callback mechanism and build_pipeline onion that back agent middleware. Locks down that the .then after-hook fires exactly once on both the buffered (await) and streaming (async for) consumption paths, modelled on the AgentLogger request/response logging pattern. Brings pipeline.py to 100% coverage.

bedus-creation force-pushed the task/langgraph-agent-harness branch from beedb37 to 990e842 Compare June 23, 2026 07:01

bedus-creation changed the title ~~LangGraph agent test harness: Agent.fake() + Agent.record()~~ Back the Agent with LangChain/LangGraph (identical public API) Jun 23, 2026

bedus-creation changed the title ~~Back the Agent with LangChain/LangGraph (identical public API)~~ AI: LangGraph-backed Agent + config package & Lab resolver Jun 23, 2026

bedus-creation changed the title ~~AI: LangGraph-backed Agent + config package & Lab resolver~~ AI: init_chat_model backend + Runner tool loop, config & Lab Jun 23, 2026

bedus-creation changed the title ~~AI: init_chat_model backend + Runner tool loop, config & Lab~~ AI: init_chat_model backend, Runner tool loop, and fake()/record() testing harness Jun 23, 2026

bedus-creation force-pushed the task/langgraph-agent-harness branch from 0278c05 to e31432b Compare June 23, 2026 20:49

bedus-creation force-pushed the task/langgraph-agent-harness branch from e31432b to dbfc847 Compare June 23, 2026 20:57

bedus-creation mentioned this pull request Jun 24, 2026

test(agents): use unittest assertions in example feature tests #141

Merged

bedus-creation and others added 10 commits June 23, 2026 23:45

Merge pull request #141 from fastapi-startkit/task/agents-unittest-as…

229ce43

…sertions test(agents): use unittest assertions in example feature tests

chore: ignore node_modules

2f91623

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

style(ai): drop explanatory comments from source

a08a4bf

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

chore(ai): default gemini model to gemini-2.5-flash-lite

7d4254c

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

bedus-creation changed the title ~~AI: init_chat_model backend, Runner tool loop, and fake()/record() testing harness~~ AI: streaming-safe middleware, agent-driven Runner, ModelBuilder tool binding Jun 25, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

AI: streaming-safe middleware, agent-driven Runner, ModelBuilder tool binding#139

AI: streaming-safe middleware, agent-driven Runner, ModelBuilder tool binding#139
bedus-creation wants to merge 16 commits into
mainfrom
task/langgraph-agent-harness

bedus-creation commented Jun 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

bedus-creation commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Streaming through middleware (the main fix)

Tool binding / execution ownership

Runner takes the agent

Typing & API cleanup

Tests

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

bedus-creation commented Jun 23, 2026 •

edited

Loading